Using Zero Anaphora Resolution to Improve Text Categorization

Authors

  • Ching-Long Yeh
  • Yi-Chun Chen
Abstract

In Chinese, anaphors are frequently omitted from text because of their prominence; an omitted anaphor is termed a zero anaphor (ZA). As a result, the information carried by ZAs cannot contribute to the computations used in text categorization. In this paper, we employ a ZA resolution method to recover the omitted anaphors. The resulting text is then used as the input to a text categorization system. The experimental results show that the ZA resolution method raises the accuracy of text categorization from 79% to 84%.
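The idea in the abstract, restoring omitted anaphors before the text reaches a categorizer, can be illustrated with a small sketch. This is not the authors' method: the `<ZA>` marker, the function names, and the "reuse the most recent explicit subject" heuristic are all assumptions made for illustration, and English gloss tokens stand in for segmented Chinese text. It only shows why resolution can matter: a term that was omitted contributes nothing to term counts until it is restored.

```python
from collections import Counter

# Hypothetical marker for a zero-anaphor position that has already been detected.
ZA = "<ZA>"


def resolve_zero_anaphors(clauses):
    """Fill each ZA marker with the most recent explicit clause-initial token.

    This is a naive stand-in heuristic for illustration only; it is not the
    resolution method described in the paper.
    """
    resolved = []
    last_subject = None
    for tokens in clauses:
        filled = []
        for tok in tokens:
            if tok == ZA and last_subject is not None:
                filled.append(last_subject)
            else:
                filled.append(tok)
        # Remember the subject only when the clause states it explicitly.
        if tokens and tokens[0] != ZA:
            last_subject = tokens[0]
        resolved.append(filled)
    return resolved


def term_counts(clauses):
    """Bag-of-words counts, ignoring unresolved ZA markers."""
    return Counter(tok for clause in clauses for tok in clause if tok != ZA)


# Toy document: the subject is stated once, then omitted twice.
clauses = [
    ["computer", "arrived", "yesterday"],
    [ZA, "runs", "fast"],
    [ZA, "costs", "little"],
]

before = term_counts(clauses)
after = term_counts(resolve_zero_anaphors(clauses))
print(before["computer"], after["computer"])
```

In this toy input the count for `computer` rises once the omitted subjects are restored, which is the kind of extra evidence a term-frequency-based categorizer could then use.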


Similar resources

Anaphora Resolution: To What Extent Does It Help NLP Applications?

Papers discussing anaphora resolution algorithms or systems usually focus on the intrinsic evaluation of the algorithm/system and not on the issue of extrinsic evaluation. In the context of anaphora resolution, extrinsic evaluation concerns the impact of an anaphora resolution module on a larger NLP system of which it is part. In this paper we explore the extent to which the well-known anaphora...


A Discriminative Approach to Japanese Zero Anaphora Resolution with Large-scale Lexicalized Case Frames

We present a discriminative model for Japanese zero anaphora resolution that simultaneously determines an appropriate case frame for a given predicate and its predicate-argument structure. Our model is based on a log linear framework, and exploits lexical features obtained from a large raw corpus, as well as non-lexical features obtained from a relatively small annotated corpus. We report the r...


ZAC: Zero Anaphora Corpus A Corpus for Zero Anaphora Resolution in Portuguese

This paper describes a corpus of Brazilian Portuguese texts built in view of the construction of an Anaphora Resolution system, which is part of a fully-fledged Natural Language Processing system (STRING). The ZAC corpus is aimed at the resolution of the so-called zero-anaphora, that is, an anaphora relation where the anaphoric expression (or anaphor) has been zeroed. The paper briefly discusses...


Intra-sentential Zero Anaphora Resolution using Subject Sharing Recognition

In this work, we improve the performance of intra-sentential zero anaphora resolution in Japanese using a novel method of recognizing subject sharing relations. In Japanese, a large portion of intra-sentential zero anaphora can be regarded as subject sharing relations between predicates, that is, the subject of some predicate is also the unrealized subject of other predicates. We develop an accu...


An Empirical Study of Zero Anaphora Resolution in Chinese Based on Centering Model

In this paper, we describe the creation of Chinese zero anaphora resolution rules by performing experiments. The rules were constructed based on the centering model. In the experiments, we selected several texts as testing examples. We compared the referents of zero anaphors in the testing texts identified by hand with the ones resolved by using an algorithm employing a resolution rule. Three r...




Publication date: 2003